GAPS: A clustering method using a new point symmetry-based distance measure

نویسندگان

  • Sanghamitra Bandyopadhyay
  • Sriparna Saha
چکیده

In this paper, an evolutionary clustering technique is described that uses a new point symmetry-based distance measure. The algorithm is therefore able to detect both convex and non-convex clusters. Kd-tree based nearest neighbor search is used to reduce the complexity of finding the closest symmetric point. Adaptive mutation and crossover probabilities are used. The proposed GA with point symmetry (GAPS) distance based clustering algorithm is able to detect any type of clusters, irrespective of their geometrical shape and overlapping nature, as long as they possess the characteristic of symmetry. GAPS is compared with existing symmetry-based clustering technique SBKM, its modified version, and the well-known K-means algorithm. Sixteen data sets with widely varying characteristics are used to demonstrate its superiority. For real-life data sets, ANOVA and MANOVA statistical analyses are performed. 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A New Point Symmetry Based Distance and its Application to Clustering

In this paper, an evolutionary clustering technique is described that uses a point symmetry based distance measure. The algorithm is therefore able to detect both convex and non-convex clusters. Kd-tree based nearest neighbor search is used to reduce the complexity of finding the closest symmetric point. The proposed GA with point symmetry distance based (GAPS) clustering technique is compared ...

متن کامل

تقویت محور مرکزی ساختارهای لوله‌ای و کاربرد آن در استخراج محور مرکزی سیاهرگ پورتال

I In this paper, a new filter is designed to enhance medial-axis of tubular structures. Based on a multi-scale method and using eigenvectors of Hessian matrix, the distance of a point to the edges of the tube is found. To do this, a hypothetical line with a deliberate direction is passed through the point which cuts the tube at its edges. For points which are located on the medial-axis, this di...

متن کامل

خوشه‌بندی داده‌های بیان‌ژنی توسط عدم تشابه جنگل تصادفی

Background: The clustering of gene expression data plays an important role in the diagnosis and treatment of cancer. These kinds of data are typically involve in a large number of variables (genes), in comparison with number of samples (patients). Many clustering methods have been built based on the dissimilarity among observations that are calculated by a distance function. As increa...

متن کامل

A partition-based algorithm for clustering large-scale software systems

Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...

متن کامل

A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach

In recent years, the advancement of information gathering technologies such as GPS and GSM networks have led to huge complex datasets such as time series and trajectories. As a result it is essential to use appropriate methods to analyze the produced large raw datasets. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Pattern Recognition

دوره 40  شماره 

صفحات  -

تاریخ انتشار 2007